Protospacer adjacent motifs (PAMs) 
were originally characterized for 
CRISPR-Cas systems that were classi- 
fied on the basis of their CRISPR repeat 
sequences. A few short 2-5 bp sequences 
were identified adjacent to one end of the 
protospacers. Experimental and bioin- 
formatical results linked the motif to the 
excision of protospacers and their inser- 
tion into CRISPR loci. Subsequently, 
evidence accumulated from different 
virus- and plasmid-targeting assays, sug- 
gesting that these motifs were also recog- 
nized during DNA interference, at least 
for the recently classified type I and type 
II CRISPR-based systems. The two pro- 
cesses, spacer acquisition and protospacer 
interference, employ different molecular 
mechanisms, and there is increasing evi- 
dence to suggest that the sequence motifs 
that are recognized, while overlapping, 
are unlikely to be identical. In this arti- 
cle, we consider the properties of PAM 
sequences and summarize the evidence 
for their dual functional roles. It is pro- 
posed to use the terms protospacer associ- 
ated motif (PAM) for the conserved DNA 
sequence and to employ spacer acquisition 
motif (SAM) and target interference 
motif (TIM), respectively, for acquisition 
and interference recognition sites. 

Introduction 

Clustered, regularly interspaced, short pal- 
indromic repeats (CRISPR) provide a basis 
for the disparate adaptive immune systems 
that occur in most archaea and many bacteria. 1-5
First insights into the function 
of these CRISPR arrays arose from the 



discovery that sequences of some CRISPR 
spacer regions closely matched sequences 
occurring in viruses or plasmids. This led, 
in turn, to the proposal that they partici- 
pate in defense against invading genetic 
elements. 6-8 This hypothesis was subsequently
supported by experiments showing
that newly acquired spacers deriving
from a group of bacteriophages produced
viral immunity in strains of Streptococcus
thermophilus. 9,11 These seminal developments
constituted a major breakthrough
in microbiology. 

Spacers derive from fragments of invad- 
ing genetic elements termed protospacers, 
and they are incorporated into CRISPR 
loci generally, but not invariably, at repeats 
adjacent to CRISPR leaders. 9-14 CRISPR 
loci are transcribed from the leader and 
processed within repeats to yield small 
crRNAs, carrying most or all of the spacer 
sequence. crRNAs act as guide RNAs for 
different interference modules that target 
and cleave DNA or RNA after anneal- 
ing to the complementary protospacer 
sequence within nucleic acid of the invading
element. 15-20 

A DNA sequence element that is 
functionally critical for CRISPR-based 
immune systems is located adjacent to each 
protospacer. It consists of a short signature 
sequence of 2-5 bp that varies according to 
the CRISPR-based system and organism. 
This motif was first detected in sequence 
alignments of putative protospacers of 
bacteriophages that match CRISPR spacers
of Streptococcus strains, 8 and subsequently
other diverse motifs were defined
for a variety of organisms and different
types of CRISPR systems. 9,11,18,21-23 The
demonstration of the apparent universality
of the short sequence motifs adjoining
protospacers by Mojica et al. 21 led to their
assigning the acronym PAM for protospacer
adjacent motif. 

CRISPR-based systems have recently
been reclassified into three main types,
I, II and III, where the first two types
target DNA elements while the type III
systems target either DNA or RNA. 3 PAM
sequences of protospacers incorporated into
CRISPR loci associated with type I systems
are located at the protospacer end that
becomes leader proximal, 18,21-23 whereas
those acquired by CRISPR loci associated
with the bacteria-specific type II CRISPR-Cas
systems occur at the leader distal end. 8-11
The first evidence implicating the PAM
sequence in the interference mechanism
was provided for the type II-A system of
S. thermophilus, 9,11 and this result was reinforced
by interference experiments on other
CRISPR-based systems. 23-29 

In this article, we focus attention on
the PAM sequence and reexamine its
potential functional versatility. Currently,
PAMs have been predicted for CRISPR-based
systems of a few organisms for
which several compatible genetic elements
have been sequenced and that carry a significant
number of identifiable protospacers. 8,11,18,21,22
Therefore, we focus mainly on
the following CRISPR types and organisms
from which many of the seminal bioinformatical
and experimental results derive:
type I-A of members of the archaeal order
Sulfolobales, type I-E and I-F of Escherichia
coli and type II-A of S. thermophilus. For
the Sulfolobales, genome sequences are
available for several organisms that carry
large and complex CRISPR loci and are also
hosts for many diverse viruses and
plasmids. 14,18,24,30 E. coli carries the relatively
simple and streamlined type I-E and I-F
systems that are particularly amenable to
genetic, biochemical and structural analyses, 12,13,19,23,28,31-33
while S. thermophilus contains
a bacteria-specific type II-A system
that yielded some of the first insights into
the PAM dependence of spacer acquisition
and protospacer interference. 9-11 Surveys of
genetic elements of other laboratory strains
of archaea and bacteria have yielded relatively
few reliable PAM sequences. 21,26,34 

In this article, we examine the experimental
evidence for the functional roles of
the PAM sequence in spacer acquisition
and interference and propose defining
two distinct functional motifs, a spacer
acquisition motif (SAM) for acquisition
and a target interference motif (TIM) for
interference. 

Characterization of 
Protospacer-Associated 
Sequences 

The first evidence for a conserved sequence
motif adjacent to predicted protospacers on
phages and plasmids was found for a type
II-A CRISPR system of S. thermophilus.
Protospacer alignments revealed a degenerate
sequence 5'-NNpu-py-A-A-a-3' downstream
from several putative protospacers. 8
The authors implied that the similarity of
the sequence to the conserved terminal
repeat sequence ACAAC, except for the terminal
nucleotide, might be mechanistically
significant. However, subsequent studies
on different S. thermophilus strains revealed
two motifs, NNAGAAa and NGGNG,
associated with bacteriophage protospacers
that were actively acquired as spacers in
two coexisting type II systems, and
the NGGNG motif showed no similarity
to the terminal sequence AAAAC of the
CRISPR repeat. 9 

These developments coincided with 
the first attempts by Kunin et al. 35 to clas- 
sify CRISPR-based systems on the basis 
of the sequences and inverted repeat con- 
tents of CRISPR repeats. They defined 
12 main families, some of which showed 
a distinct phylogenetic bias, in particular 
to archaea or bacteria. 35 Mojica et al. 21 
examined members of these repeat-based 
CRISPR families looking for characteris- 
tic consensus sequence motifs associated 
with predicted protospacers. Although the 
approach was limited by the difficulty in 
predicting multiple protospacers in genetic 
elements for most CRISPR-containing 
archaea and bacteria, significant num- 
bers of protospacers were found to match 
CRISPR loci of a few organisms belong- 
ing to different repeat families. These 
consensus protospacer adjacent motifs 
were then assigned the acronym PAM. 21 
The main PAM assignments from this 
study are summarized in Table 1 together 
with more recent results. Here, we present
the sequence at the 5'-end of the crRNA
sense strand of the protospacer (5'-PAM-protospacer)
for the type I systems and the
opposite orientation, 5'-protospacer-PAM,
for the type II systems. However, there 
are still relatively few CRISPR-carrying 
organisms for which reliable PAM 
sequences have been identified. 
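
The orientation convention described above can be illustrated with a minimal Python sketch. The function name `flanking_motif`, its parameters and the toy sequence are illustrative assumptions and do not come from any published tool. The sketch only reads the bases immediately upstream of the protospacer for type I systems and immediately downstream for type II systems, on the strand as it is written 5' to 3'.

```python
def flanking_motif(target: str, start: int, end: int,
                   system: str, motif_len: int = 3) -> str:
    """Return the motif flanking the protospacer target[start:end].

    Assumed convention (taken from the text, simplified):
      type I  -> motif lies 5' of the protospacer (5'-PAM-protospacer)
      type II -> motif lies 3' of the protospacer (5'-protospacer-PAM)
    """
    if system == "I":
        if start < motif_len:
            raise ValueError("not enough upstream sequence for the motif")
        return target[start - motif_len:start]
    if system == "II":
        if end + motif_len > len(target):
            raise ValueError("not enough downstream sequence for the motif")
        return target[end:end + motif_len]
    raise ValueError("system must be 'I' or 'II'")


# Toy target: 5 flanking bases, a 10-nt protospacer, 5 flanking bases.
toy = "GGCCA" + "ATGCATGCAT" + "TGGAA"
type_i_motif = flanking_motif(toy, 5, 15, "I")    # bases just upstream
type_ii_motif = flanking_motif(toy, 5, 15, "II")  # bases just downstream
```

The motif length of three is only a default for the toy example; the motifs discussed in this article range from 2 to 5 bp.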

Putative Role 
in Spacer Acquisition 

The acquisition process involves recognition
and excision of protospacers and
their insertion into CRISPR loci, and it
appears to be the most conserved stage of
the adaptive immune response. Generally,
three proteins, Cas1, Cas2 and Cas4, have
been implicated, 4 although the type I-E
and I-F systems of E. coli lack Cas4, as do
type II-A and some type III-A systems. 3
The largest and most conserved protein,
Cas1, carries DNA endonuclease activity, 36,37
and in an E. coli type I-E system, its
mutation can inhibit spacer acquisition. 12
Cas2 protein from Bacillus halodurans
also exhibits dsDNA endonuclease activity, 38
while Cas2 proteins from
Sulfolobus solfataricus and other archaea
showed low-specificity ssRNA endonuclease
activity, currently of uncertain biological
significance. 39,40 Cas4 of S. solfataricus
carries 5'-to-3' DNA exonuclease activity
that may generate recombinogenic 3'-overlaps
for CRISPR spacer insertion. 41 

A potential link between acquisition
and the PAM sequence was provided
earlier for the Sulfolobales by Shah et
al. 18,22 Distance trees were prepared for
many different CRISPR loci of several
members of the Sulfolobales based on
sequences of CRISPR repeats, leaders
and Cas1 proteins. Each tree showed
three similar major branches containing
components (repeat, leader or Cas1 protein)
associated with the same CRISPR
loci. Moreover, prediction of sequence
motifs (PAMs) adjoining putative protospacers
exhibiting significant sequence
matches to spacers within the different
CRISPR loci revealed strong biases to
CCN, TCN and GTN, respectively, for
the three main branches. 18,22 This suggested
that, in addition to Cas1, the protospacer
motif, repeat and leader were
involved in acquisition. These results are
updated in Figure 1A for the repeat and 
Table 1. Summary of experimental data relating to the dependence of DNA interference on PAM

Organism         Subtype  PAM     Interference (+)         Interference (-)                     Reference
S. solfataricus  I-A1     CCN     CCA TCA                  GAC TTA                              18, 24
S. solfataricus  I-A2     TCN*    TCG                                                           30
H. volcanii      I-B      n.d.    TTC ACT TAA TAT TAG CAC  remaining 58 trinucleotides          26
H. walsbyi       I-B      TTC*    n.d.                                                          56
E. coli          I-E      AWG     ATG AAG GAG AGG          ACG CAG TAG GTG TGG TTG CCG CTG AAA  13, 23, 32, 33
                                                           AAC AAT AGC ATA ATC ATT GAA CAA CCT
                                                           CCC GGC TCC TCT
E. coli          I-F      CC      GCC CCC GCT CTT CAA      GTC AAA GTT GGG ACA GCA AAT AAC      28
P. aeruginosa    I-F      CC                               AG                                   58
S. thermophilus  II-A     NNAGAA  AGAA                     AGAG AAAA ATAA                       11
S. agalactiae    II-A     NGG     NGG NGA                  NAC NCA NAG NGT                      27
S. thermophilus  II-A     NGGNG   GGNG                     CGTG GCTG GGTC                       25

CRISPR subtypes are given for each organism. *, indicates that the PAM was determined from sequence alignments. Interference dependence on
permutations of the PAM is summarized in the interference columns, where (+) indicates successful interference and (-) denotes no or little interference.
All triplet sequences are drawn 5' to 3'. Literature references describing the original results are provided. n.d., not determined.



Cas1 protein sequences, including many
new sequences. The tree reveals that the
cas1 gene of a given CRISPR subtype
always occurs together with genomic
CRISPR arrays of the same subtype. A
single exception is Metallosphaera cuprina,
which probably results from the
occurrence of an IS element-mediated
transposition between the cas1 gene and
the adjacent CRISPR array. Furthermore,
logoplots of the predicted PAMs reinforce
that there is a close correlation between
the sequence identity of PAM and the
CRISPR subtype (Fig. 1B). These results
reinforce and extend the earlier sequence
analyses 18,22 demonstrating the strong
interdependence of the type of CRISPR
array, the Cas1 protein and PAM, and
they correlate with the contemporary
evidence for coevolution of the PAM
sequence and CRISPR repeat families
(Table 1). 21,34 

Further support for the involvement 
of the leader in acquisition came from 
the observation that no spacer uptake was 
observed in the leaderless CRISPR locus 
F that is highly conserved in sequence 
between different S. solfataricus strains, 
whereas all other CRISPR loci carrying 
leaders acquired new spacers. 14,17,18 More 
direct evidence for a leader role in acqui- 
sition was provided recently by Yosef et 
al. 12 who demonstrated that an unknown 
sequence located within the first 60 bp of 
the leader, adjacent to the first CRISPR 
repeat, was essential for spacer acquisition 
in an E. coli type I-E system, and the size of
this important region was further reduced 
to 43 bp in an independent study on this 
type I-E system. 42 

At present, little is known about the 
detailed mechanisms of spacer acquisition. 
The PAM sequence is likely to generate a 
recognition site for type I protospacer exci- 
sion from genetic elements, or fragments 
thereof, with cleavage occurring adjacent 
to the PAM sequence (see below). At the 
other end of the protospacer there is no 
detectable sequence specificity for cleav- 
age. 14 Moreover, multiple cutting sites 
can occur over up to six base pairs for a 
given protospacer region in different cop- 
ies of the same genetic element. 14 This led 
Erdmann and Garrett 14 to propose that a 
ruler cleavage mechanism occurs for pro- 
tospacer excision measured from the PAM 
sequence. Such a mechanism is also consis- 
tent with the observation that many pro- 
tospacers contain internal PAM sequences 
that can also be recognized, indepen- 
dently, during protospacer excision from 
other copies of the same genetic element. 14 
Diez-Villasenor et al. 42 have also proposed 
a second ruler mechanism operating at the 
spacer insertion stage whereby an initial 
cleavage occurs at the leader-repeat bound- 
ary of a CRISPR locus with a secondary 
cut occurring at the leader distal end of 
the first repeat. 42 This shared ruler strat- 
egy during the two main acquisition steps 
could ensure maintenance of the regular 
periodicity within CRISPR loci. 42 
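
The ruler idea for protospacer excision can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a model of the actual enzymology: the name `ruler_candidates`, the default protospacer length of 39 nt and the toy sequence are hypothetical, only one strand is scanned, and the reported variation of up to six base pairs at the PAM-distal end is ignored. The sketch lists every position where a CCN-type motif occurs and measures a fixed-length protospacer from it, so that motifs lying inside one candidate protospacer also yield their own candidates.

```python
from typing import List, Tuple


def ruler_candidates(dna: str, pam_prefix: str = "CC", pam_len: int = 3,
                     protospacer_len: int = 39) -> List[Tuple[int, str, str]]:
    """List (position, motif, protospacer) for each motif found on one strand.

    The protospacer is measured as a fixed length starting directly
    downstream of the motif (hypothetical 'ruler'; length is an assumption).
    """
    dna = dna.upper()
    hits = []
    # Stop early enough that a full-length protospacer still fits.
    for i in range(len(dna) - pam_len - protospacer_len + 1):
        if dna.startswith(pam_prefix, i):
            start = i + pam_len
            hits.append((i, dna[i:start], dna[start:start + protospacer_len]))
    return hits


# Toy example with a very short ruler so the output stays readable.
candidates = ruler_candidates("ACCATTTTCCGAAAAAA", protospacer_len=4)
```

Scanning the reverse complement in the same way would cover the second strand.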

Evidence has also been presented for the
occasional uptake of spacers in a reverse
direction, for a type I-A system of S. solfataricus, 14
a type II-A system of Streptococcus
agalactiae 17 and a type I-E system of E.
coli, 41 and this places some constraints
on possible mechanisms of spacer insertion. 43
Moreover, it may be significant
for understanding details of the insertion
mechanisms employed in S. solfataricus
and Streptococcus agalactiae type I-A and
type II-A systems, respectively, that PAM
sequences located at opposite ends of the
protospacer generate the same sequences
when inverted, 5'-CCN vs. NGG-3'. 
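
The equivalence of 5'-CCN and NGG-3' on inversion can be checked with a short reverse-complement helper. The function name is an illustrative assumption; the only ambiguity code handled is N, which is its own complement.

```python
# Complement table for the four bases plus the ambiguity code N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Reading 5'-CCN on the opposite strand gives 5'-NGG.
inverted = reverse_complement("CCN")
```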

A different picture is emerging from
recent studies on a genetically manipulated
E. coli type I-E system that carries a single
gene cassette encoding acquisition and
interference Cas proteins. 44-46 Yosef et al. 12
first induced spacer acquisition by overexpressing
proteins Cas1 and Cas2, and
found relatively low conservation at positions
-3 and -2 in the AWG PAM, whereas
position -1G was highly conserved. They
also provided evidence for the new repeat
being copied from repeat 1 during spacer
acquisition. Swarts et al. 13 demonstrated
examples of compensatory base changes
between the -1 PAM position and the terminal
downstream nucleotide of the repeat
at which spacer insertion occurred. They
inferred that the PAM -1 nucleotide is
taken up in the newly synthesized repeat, a
conclusion that was supported by Datsenko
et al. 32 Goren et al. 47 took this a stage further,
proposing that the -1 position of the
PAM should be considered the terminal
nucleotide of the protospacer sequence and 
Figure 1. Coevolution of sequences of CRISPR repeats, Cas1 proteins and PAMs, for members of the Sulfolobales. The type I CRISPR systems fall into
three main subtypes, I-A, I-D and I-B. The I-A systems are the most common and can be classified into distinct subfamilies I-A1 and I-A2. (A) A neighbor
joining tree of CRISPR repeat sequences (left) is juxtaposed with that of the translated sequences of cas1 genes (right). CRISPR loci are identified by the
short name of the organism and the number of CRISPR repeats; cas1 genes are colored according to the CRISPR subtypes (I-A1, blue; I-A2, pink; I-D, yellow
and I-B, green). (B) Protospacer matches from spacers of the four distinct subtypes of CRISPR arrays yield dominant consensus PAMs, CCN for I-A1,
TCN for I-A2, GTN for I-D and ATTA for two protospacers predicted for subtype I-B. Motifs were derived from spacer-protospacer matches on viral
or plasmid genomes of the Sulfolobales exhibiting five or fewer mismatches. The total number of predicted protospacers is given in brackets. 



not a part of the repeat. Mojica et al. 21 had 
earlier identified CRISPR systems of other 
organisms for which the -1 PAM position 
is conserved, including some type I-C, I-B 
and I-F systems. For some of them, the -1 
PAM position matches the first nucleotide 
of the repeat and could also, potentially, be 
assigned to the protospacer. 21 

In summary, whereas the results for the 
type I-E system of E. coli provide some 
insights into how specific spacer insertion 
into CRISPR repeats can occur by exploit- 
ing the -1 position of the PAM sequence, 
this mechanism cannot be generally appli- 
cable because for many type I and type II 
systems, the equivalent PAM position is 
not conserved. 14,21 

PAM-Protospacer Selection 
on Genetic Elements 



It was estimated from statistical analyses 
of distributions of predicted protospac- 
ers, in diverse viral and plasmid genomes 
of the Sulfolobales, that they were located 
randomly on both circular and linear 
genomes. 22,48 They exhibited no significant 
bias with respect to either direction or to 
their location within protein coding or non- 
coding regions. This study also provided 
support for the PAM sequence being inde- 
pendent of the type of genetic element from 
which the protospacer originated because 
a high incidence of putative protospacers 
from both linear viral genomes and circular
(inferred to be positively supercoiled 49) 
plasmid and viral genomes occurred within 
the same CRISPR loci. 18 In an extensive 
study of bacteria and archaea, Mojica et 
al. 21 reached similar conclusions for a wide 
range of genetic elements. 

Recently, hundreds of spacers acquired
from a conjugative plasmid co-infecting
S. solfataricus with a tailed-fusiform virus
were sequenced and analyzed for their
PAM sequences. 14 Originally, these results
were presented for a few smaller, unlinked
contigs, but here 399 unique protospacers
are reanalysed for a large 22 kb contig of
the plasmid from Monument Geyser Basin,
Yellowstone National Park (herein named
pMGB1), and the data are presented in
Table 2. The results quantify the high level
of conservation of the acquisition CCN
PAM sequences (95%). Moreover, they
reaffirm the conclusion from the earlier
bioinformatical analyses that protospacers
occur randomly throughout the targeted
genetic element on both strands with no
significant bias to predicted protein coding
regions. 14,22,48 The random distribution
of acquired protospacers is also consistent
with the results obtained for Streptococcus
type II systems 9,11,50 and the genetically
modified E. coli type I-E system. 12,13 
There are, nevertheless, significant differences
between results obtained for the
type I-A system of Sulfolobus and the type
I-E system of E. coli. First, the -1 PAM
position is only conserved for the latter
system, while the -2 and -3 PAM positions
for the type I-A system of Sulfolobus are
much more highly conserved than in the
type I-E system. 12-14 Second, when multiple
protospacers are incorporated into
a CRISPR locus of a single clone, they
derive from unidirectional protospacers
on a genetic element for the E. coli type
I-E system, but this was not observed for
the Sulfolobales. 14 This is exemplified by
an in silico analysis of the type I-A system
of Metallosphaera sedula, where the orientations
of 30 putative protospacers matching
(with < 5 nt mismatches) the genome
of the Acidianus two-tailed virus, ATV,
are arranged bidirectionally for sections of
three CRISPR loci (Fig. 2). 13,14,32 

Datsenko et al. 32 provided a rationale 
for unidirectional uptake of protospacers 
within a specific CRISPR locus of the type 
I-E system. They hypothesized that low 
level annealing of crRNAs from older spac- 
ers with newly invading genetic elements 
can stimulate ("prime") acquisition of new 
spacers from that element. This implicit 
coupling of interference and acquisition is 
indirectly supported by the co-expression 
of the acquisition proteins Cas1 and Cas2 
and interference-related Cas proteins in 
that type I-E system. Moreover, the miss- 
ing Cas4 DNA exonuclease, 40 which is 
implicated in acquisition for other CRISPR 
types, 46 may be complemented by the inter- 
ference Cas3 endonuclease. 32 Consistent 
with this proposal, in the type I-F system 
of Pectobacterium atrosepticum, it has been 
shown that Cas1 interacts with a Cas2-Cas3
hybrid protein. 51 A major advantage of 
reverse coupling of interference and spacer 
acquisition 32 could be that it provides a 
means of preferentially selecting genetic 
elements for interference and thereby facili- 
tates avoidance of chromosomal interfer- 
ence, although priming by a low level of 
crRNA base pairing could still lead to for- 
tuitous targeting of chromosomal sites. 

If this hypothesis were more generally 
applicable to other CRISPR acquisition 
systems, it could resolve an earlier puzzle as 
to why individual CRISPR loci often carry 
multiple spacer matches against single
genetic elements. 17,18 Although it has been 
demonstrated experimentally that acquisi- 
tion of more than one spacer from an invad- 
ing genetic element can provide increased 
immunity against that element for both the 
type II-A system of S. thermophilus 9,11 and 
the type I-E system of E. coli} 1 CRISPR 
loci often carry many spacers matching a 
given element. 17,18,52,53 This phenomenon 
is exemplified by the 30 spacers predicted 
to match the lytic virus ATV 54 within 
three CRISPR loci of the crenarchaeon 
M. sedula (Fig. 2). Another explanation 
for multiple CRISPR spacers matching a 
single genetic element, especially for the 
crenarchaea where many viruses coexist in 
stable relationships with their hosts, is the 
possibility that CRISPR systems adopt a 
regulatory role by exhibiting limited levels 
of interference. 17,18 Almendros et al. 28 have 
also recently emphasized the cellular disad- 
vantages of CRISPR-based systems being 
too efficient and rejecting potentially ben- 
eficial foreign DNA. 

Protospacer Recognition 
during Interference 

While there is strong support for the 
involvement of PAM sequences in the 
spacer acquisition step, their importance 
for interference is less clear. Barrangou et
al. 9,11,55 provided the first evidence implicating
PAM sequences in this stage of the 
immune response. They demonstrated 
for the type II-A system of S. thermophi- 
lus that mutations in the AGAA motif, 
located downstream from the protospacer, 
allowed phages to avoid CRISPR defense. 
Moreover, experiments involving targeting 
of plasmid protospacers in Sulfolobus pro- 
vided support for a PAM sequence role in 
interference. 24 In the presence of the CCN 
PAM sequence, interference was effec- 
tive with the few surviving transformants 
primarily carrying deletions in CRISPR 
loci that included the matching spacer. 
When the PAM sequence was replaced 
with GGN, GAN or TTN, there was no 
detectable interference. However, in the 
presence of TCN (and, to a lesser degree,
CTN), there was a significant reduction in 
transformation efficiency consistent with 
an intermediate level of targeting. This 
contrasted with the stringent acquisition 
PAM sequence-dependence (Table 2) and 



Table 2. A summary of protospacer acquisition results from 399 sequenced non-identical
protospacers on a 22 kb contig of the Sulfolobus conjugative plasmid pMGB1 by subfamily
I-A1 CRISPR loci C, D and E in S. solfataricus P2

Protospacer properties   Protospacers (%)
forward                  52
reverse                  48
CCN PAM                  95
"inverted" CCN PAM       0.5
no PAM                   4.5

The data are derived from an experimental study by Erdmann and Garrett. 14 The designations
"forward" and "reverse" are arbitrary. "No PAM" includes a variety of different
dinucleotide sequences, including CTN and TCN.

suggested an altered mode of PAM recognition
occurring during interference,
where possibly a C at position -2 or -3 was
sufficient (Table 1). 12-14 

The first systematic analysis of the 
interference motif was performed on the 
haloarchaeon Haloferax volcanii, employ- 
ing a similar plasmid-targeting approach 
to Sulfolobus. 16 Although no specific PAM 
sequence has been identified for the type 
I-B CRISPR-Cas system of this organism, 
it was demonstrated that in total, six proto- 
spacer-adjacent triplets TTC, ACT, TAA, 
TAT, TAG and CAC, out of the 64 possible 
triplets tested, rendered protospacers active 
for targeting. 26 None of the triplet positions 
is completely conserved, and only positions 
-2 and -3 show limited conservation, sug- 
gesting that the PAM sequence might be 
TAN. Moreover, a TTC PAM has been predicted
for the type I-B system of another
haloarchaeon, Haloquadratum walsbyi. 56 In 
summary, these results on the archaeal type 
I-A and I-B systems are consistent with 
different PAM recognition at the DNA 
interference stage of type I systems, pos- 
sibly limited to one nucleotide, with some 
sequence permutations being permitted. 

A similar conclusion was recently 
reached by Lopez-Sanchez et al. 27 for a 
type II-A system of S. agalactiae, where 
the efficiency of transformation of plasmid 
constructs carrying protospacers was also 
studied. Whereas no transformants were 
formed when the downstream PAM NGG 
was present, no interference occurred when 
the dinucleotide was converted to AC, CA, 
AG or GT. However, the dinucleotide GA 



www.landesbioscience.com 



RNA Biology 



895 



[Figure 2 image: sections of the CRISPR loci Msed56, Msed151 and Msed162]

Figure 2. In silico determination of multiple spacer matches to the bicaudavirus ATV in three CRISPR loci of the crenarchaeon M. sedula. Repeat-spacer
units from sections of type I-A1 CRISPR arrays are depicted as arrowheads, directed away from the leader. Numbers to the left and right delineate the
range of repeat-spacer units depicted. Shaded units yield close matches (< 5 nt mismatches) to protospacers in ATV. The orientation of the matching
spacers with respect to the ATV genome is indicated by a dot above or below the shaded arrowheads. 
was equal to GG in reducing transformation
efficiency, consistent again with a 
reduced level of PAM recognition specific- 
ity operating during interference. 
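
The difference between a stringent acquisition motif and a more permissive interference motif can be expressed as matching against sets of degenerate patterns. The following Python sketch is illustrative only: the names `matches`, `recognized`, `SAM_MOTIFS` and `TIM_MOTIFS` are assumptions, and treating the S. agalactiae PAM in Table 1 (NGG) as the acquisition motif while taking NGG and NGA as the interference-competent motifs is a simplification of the plasmid transformation results summarized above.

```python
# Standard IUPAC nucleotide codes used in this article (subset).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG",
    "K": "GT", "M": "AC", "N": "ACGT",
}


def matches(motif: str, seq: str) -> bool:
    """True if a concrete sequence fits a degenerate motif of equal length."""
    motif, seq = motif.upper(), seq.upper()
    return len(motif) == len(seq) and all(
        base in IUPAC[code] for code, base in zip(motif, seq)
    )


def recognized(motifs, seq: str) -> bool:
    """True if the sequence fits at least one motif in the collection."""
    return any(matches(m, seq) for m in motifs)


# Assumed motif sets for the S. agalactiae type II-A example (see Table 1).
SAM_MOTIFS = ["NGG"]          # acquisition: the consensus PAM
TIM_MOTIFS = ["NGG", "NGA"]   # interference: GA worked as well as GG

flank = "TGA"
acquired = recognized(SAM_MOTIFS, flank)   # not an acquisition-type motif
targeted = recognized(TIM_MOTIFS, flank)   # still competent for interference
```

Keeping the two motif sets separate is the point of the sketch: the same flanking sequence can fail the acquisition test and still pass the interference test.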

Almendros et al. 28 have taken this a stage 
further by demonstrating clear differences 
between the acquisition and interference 
protospacer motifs for a constitutive type 
I-F system in E. coli. They showed that a C 
located either at PAM position -2 or imme- 
diately upstream of the PAM (at position 
-3) resulted in interference, while the pres- 
ence of both C's produced enhanced inter- 
ference effects (Table 1). 

The CRISPR system that diverges from 
this emerging consensus is the genetically 
manipulated type I-E system of E. coli 
where the PAM triplet, or more precisely 
the -2T and -3A positions, since the -1G 
position lies within the protospacer in 
this system, 4,13 appears to be critical for 
interference. Three of six possible single 
nucleotide mutations of the A and T were 
shown to produce strongly reduced inter- 
ference. 23 Semenova et al. 23 demonstrated 
further that in this system, maintenance
of seven of the first eight adjacent base
pairs of the crRNA-protospacer hybrid (a
"seed" sequence) was essential for effective
interference. This highly specific 
interaction of the type I-E interference 
complex may also explain why the trans- 
formants generally evade interference via 
point mutations in the protospacer or 
PAM sequence, whereas in several other 
type I and type II systems and a type III-B 
system, elimination of matching CRISPR 
spacers or loss or mutation of Cas or 
Cmr proteins, are a much more common 
response to interference. 24,26,27,57-60 
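
The seed requirement reported for the type I-E system can be sketched as a simple mismatch count. This is a hedged illustration: the function name `seed_intact` and its parameters are assumptions, both sequences are written in the same sense with index 0 taken as the first position of the seed, and the sketch does not encode which single position may be mismatched because that detail is not given here.

```python
def seed_intact(spacer: str, protospacer: str,
                seed_len: int = 8, max_mismatches: int = 1) -> bool:
    """True if at most max_mismatches occur within the first seed_len positions.

    With the defaults this corresponds to 'seven of the first eight'
    positions pairing; positions beyond the seed are not examined.
    """
    if len(spacer) < seed_len or len(protospacer) < seed_len:
        raise ValueError("sequences are shorter than the seed length")
    mismatches = sum(
        a != b
        for a, b in zip(spacer[:seed_len].upper(), protospacer[:seed_len].upper())
    )
    return mismatches <= max_mismatches


spacer = "ACGTACGTAA"
one_seed_mismatch = seed_intact(spacer, "ACGTACCTAA")   # one change in the seed
two_seed_mismatches = seed_intact(spacer, "ATGTACCTAA")  # two changes in the seed
```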

Structural studies on the E. coli type 
I-E system provided evidence for type I-E-specific
CasA protein, a predicted Cas8
homolog, interacting via a loop region 
with the PAM sequence located on the 
DNA strand complementary to the crRNA 
rather than to dsDNA or the non-targeted 
DNA strand. 29 Furthermore, the finding 
that the PAM sequence was important for 
initial binding of the interference complex 
to the protospacer, and not the targeted 
DNA strand, led to the proposal that the 
PAM sequence facilitates protospacer rec- 
ognition. 29 In contrast, in a parallel study 
of the type II system of Streptococcus pyo- 
genes, Jinek et al. 61 demonstrated that the 
PAM sequence was recognized exclusively 
on the opposite, non-complementary DNA 
strand during interference, consistent with 
the operation of fundamentally different 
molecular interference mechanisms in the 
type I-E and type II systems. 

PAM Sequences 
and Type III Systems 

The best characterized type III inter- 
ference systems are the type III-A Csm 
DNA-targeting system of Staphylococcus epi- 
dermidis 61,63 and a type III-B RNA tar- 
geting system of the archaeon Pyrococcus 
furiosus. 20,64 No PAM-dependent spacer 
acquisition data are available for these 
organisms but experimental evidence sug- 
gests that specific PAM sequences are not 
recognized during interference. For the 
type III-A Csm system of S. epidermidis, 
evidence was presented that mismatched 
base pairing between the 5'-tag of the 
crRNA and the PAM region was sufficient 
to ensure interference. 63 Moreover, an anti- 
sense CRISPR RNA targeted by the type 
III-B Cmr system of P. furiosus was cleaved 
despite perfect matching of the 5'-tag of the 
crRNA to the antisense RNA substrate. 20 
Similar evidence for PAM-independent
interference was obtained for another 
RNA-targeting type III-B Cmr system of 
S. solfataricus^ and for a different type 
of III-B Cmr system of Sulfolobus islandi- 
cus putatively implicated in transcription- 
dependent DNA targeting. 60 

Thus, there appears to be no depen- 
dence of type III interference on PAM 
sequences. Indeed, among 126 available
archaeal genome sequences (www.ebi.ac.uk/genomes/archaea.html), from early
2012, a total of 89 type III systems were 
represented, 51 of which are present as 
stand-alone gene cassettes and only 13 
were linked exclusively to acquisition gene 
cassettes. 45 Therefore, these independent 
modules must function by utilizing spacers 
accumulated by type I acquisition systems 
in archaea, or by type I or type II acquisi- 
tion systems in bacteria. 60 Consistent with 
this inference, Deng et al. 60 have demon- 
strated experimentally that a type III-B 
CRISPR interference module of S. islan- 
dicus can share crRNA processing Cas6 
and specific spacers with co-existing type 
I systems. 60 

In summary, this widespread chro- 
mosomal uncoupling of type III interfer- 
ence modules from acquisition modules, 
as well as their lack of dependence on 
PAM sequences, seems to be a precondi- 
tion for the occurrence of interference 
module exchange between organisms and 
the functional coupling to non-cognate 
CRISPR loci as well as to other types of 
acquisition modules. 4 

Figure 3. Overview of putative PAM, SAM and TIM interactions during acquisition and interference in type I and type II CRISPR systems. (A) The spacer 
acquisition motif (SAM) is recognized on the invader DNA by the Cas protein acquisition complex, which leads to the protospacer being excised by a 
putative ruler mechanism 14 and reinserted into a CRISPR locus by another putative ruler mechanism. 42 During interference by type I systems, the target 
interference motif (TIM), on the crRNA-complementary DNA strand, is recognized by the Cas protein-crRNA complex, where both TIM recognition and 
crRNA annealing are required for successful invader cleavage. 29 (B) In type II systems, the SAM/PAM motif is inferred to be recognized by a mechanism 
related to the type I system but inverted on the dsDNA, whereas TIM recognition occurs on the non-complementary DNA strand to the crRNA. 

Intracellular Co-existence of Similar CRISPR Types Recognizing Different PAMs 

Organisms often carry different CRISPR- 
based systems. For example, S. thermophilus 
contains types I, II and III-A systems, 10,25 
and many thermophilic archaea exhibit 
type I and type III systems. 4 The diverse 
type III systems are likely to provide the 
host with a variety of interference options, 
often by sharing CRISPR loci of type I and 
II systems, and sometimes utilizing their 
CRISPR RNA processing enzymes. 41,60,64 
There is also evidence of co-functionality of 
different subfamilies of type I interference 
complexes with different PAM sequences. 
For example, a subfamily I-A1-specific 
interference protein Cas7, encoded adjacent 
to CRISPR loci C and D (PAM: CCN) 
of S. solfataricus, was found to be com- 
plexed with crRNAs from both subfam- 
ily I-A1 CRISPR loci and from subfamily 
I-A2 CRISPR loci A and B (PAM: TCN). 66 
Moreover, a subfamily I-A1 interference 
complex was shown to target protospac- 
ers with the subfamily I-A PAM. 24 These 
observations further reinforce the view 
that PAM sequence recognition during 
interference differs from that occurring in 
the acquisition step. 

Such flexibility of PAM sequence rec- 
ognition during interference potentially 
renders the immune systems more versa- 
tile in that invading genetic elements will 
be unable to avoid targeting by incurring, 
for example, a single nucleotide mutation 
in a PAM sequence. Furthermore, this 
versatility can be extended for an organ- 
ism by accumulating diverse interference 
modules, especially those of type III, 
which appear to exhibit a range of differ- 
ent targeting mechanisms, some of which 
remain to be elucidated. 20,60,64,65,67 
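
The effect of relaxed motif recognition on escape mutations can be illustrated with a minimal Python sketch. It assumes, purely for illustration, an interference complex that accepts either of the two motifs mentioned above (CCN and TCN). The function names and the accepted motif set are hypothetical and do not represent measured specificities.

```python
# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches_motif(flank, motif):
    """Return True if a concrete flank (A/C/G/T) fits a degenerate motif."""
    if len(flank) != len(motif):
        return False
    return all(
        base in IUPAC[code]
        for base, code in zip(flank.upper(), motif.upper())
    )


def is_accepted(flank, accepted_motifs):
    """Return True if the flank fits at least one accepted motif."""
    return any(matches_motif(flank, motif) for motif in accepted_motifs)


# Assumption: the complex tolerates both motifs named in the text.
ACCEPTED = ["CCN", "TCN"]

for flank in ["CCA", "TCA", "GCA"]:
    print(flank, is_accepted(flank, ACCEPTED))
```

Under these assumptions, a C-to-T change at the first position of the flank leaves the protospacer targetable, whereas a change to G at the same position does not.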

Conclusions and Perspectives 

Both spacer acquisition and interference 
are dependent to some extent on PAM 
sequences. Currently, we know little about 
the molecular mechanism of PAM recog- 
nition during spacer acquisition, although 
preliminary evidence suggests that an ini- 
tial, specific cleavage occurs downstream 
from the PAM in the type I-A system of 
Sulfolobus with a secondary cut directed 
by a ruler mechanism. 14 In contrast, dur- 
ing interference, the PAM sequence is rec- 
ognized on opposite DNA strands for type 
I-E and type II-A systems. 29,61 Since spacer 
acquisition and protospacer interference 
must utilize fundamentally different 
molecular mechanisms, as envisioned in 
Figure 3, we consider that features of the 
PAM sequence recognized in these pro- 
cesses should be defined separately. Our 
proposal is to retain the acronym PAM for 
the conserved signature sequence and we 
prefer protospacer associated motif to the 
originally proposed protospacer adjacent 
motif 21 because some PAMs have recently 
been shown to include protospacer nucle- 
otides. 12,13,42,47 Further, we suggest using 
the acronyms SAM for spacer acquisition 
motif and TIM for the target interference 
motif. These motifs are defined further, 
and justified separately, below. 

Protospacer associated motif (PAM). 
Consensus conserved sequence motif 
occurring at one end of predicted protospac- 
ers for each CRISPR-based system. PAMs 
are identified by sequence alignments of 
genetic elements containing sequences that 
match spacers within individual CRISPR 
arrays. The identity of the consensus PAM 
sequence probably depends on two molecu- 
lar processes: (1) motif recognition by the 
spacer acquisition protein complex and 
(2) the subsequent selection of protospac- 
ers for targeting (i.e., those targeted more 
efficiently probably due to their exhibiting 
optimal interference motifs). 
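
As a rough illustration of how a consensus PAM might be read off aligned protospacer flanks, the following Python sketch reports the most common base at each flank position. The flank sequences are invented toy data, and the function name and the `min_fraction` threshold are assumptions made for this example rather than a published procedure.

```python
from collections import Counter


def consensus_motif(flanks, min_fraction=0.5):
    """Return a per-position consensus for equal-length flanking sequences.

    A position is reported as its most common base when that base reaches
    min_fraction of all flanks, and as 'N' otherwise.
    """
    if not flanks:
        raise ValueError("at least one flank is required")
    length = len(flanks[0])
    if any(len(flank) != length for flank in flanks):
        raise ValueError("all flanks must have the same length")

    consensus = []
    for column in zip(*flanks):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(flanks) >= min_fraction else "N")
    return "".join(consensus)


# Hypothetical 3-nt flanks taken next to sequences that match spacers.
toy_flanks = ["CCA", "CCT", "CCG", "CCC", "TCA", "CCA"]
print(consensus_motif(toy_flanks, min_fraction=0.6))
```

A consensus obtained in this way reflects both of the processes listed above, because only protospacers that were acquired, and then retained, contribute flanks to the alignment.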

Spacer acquisition motif (SAM). 
Functional motif associated with a proto- 
spacer and recognized by the spacer acqui- 
sition machinery of each CRISPR-based 
system prior to protospacer excision. At 
present, the mode of recognition of the 
PAM sequence, and the DNA strand(s) 
involved, remain unknown. Multiple SAMs may 
occur for a given PAM but the predomi- 
nant or consensus SAM is likely to match 
the PAM but may be DNA-strand specific. 

Target interference motif (TIM). 
Functional motifs associated with a proto- 
spacer and recognized by the DNA inter- 
ference complex for each type I and type 
II CRISPR-Cas system. Multiple TIMs 
can occur for a single PAM as has been 
demonstrated experimentally for type I-A, 
I-B, I-E and type II-A systems, and the 
sequences are strand specific. 29,61 

In conclusion, we have used the order 
5'-PAM-protospacer throughout this 
article for CRISPR type I systems, and 
protospacer-PAM-3' for type II systems for 
defining the PAM sequence. In a sense, the 
strand selection for the PAM sequence is 
arbitrary, and both orientations are widely 
used. 18,21,68 However, this lack of strand 
specificity is unlikely to apply to SAM 
and does not apply to TIM, for which the 
recognized motif is located on the crRNA- 
complementary DNA strand for a type I-E 
system 29 and on the non-complementary 
strand for a type II-A system 60 (Fig. 3). 
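
Because the strand chosen for writing a PAM is a matter of convention, the same site has two written forms. The short Python sketch below converts a motif, including IUPAC degenerate codes, into the sequence read 5' to 3' on the opposite strand. The function name is hypothetical, and the sketch makes no claim about which strand a given Cas complex actually recognizes.

```python
# Complement of each IUPAC nucleotide code.
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W", "K": "M", "M": "K",
    "B": "V", "V": "B", "D": "H", "H": "D", "N": "N",
}


def reverse_complement(motif):
    """Return the motif as read 5' to 3' on the opposite DNA strand."""
    return "".join(COMPLEMENT[code] for code in reversed(motif.upper()))


for motif in ["CCN", "TCN"]:
    print(motif, "->", reverse_complement(motif))
```

A motif written CCN on one strand therefore reads NGG on the other, which is why the orientation and the strand must be stated whenever a SAM or TIM is reported.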